Assessing IRT Model-Data Fit for Mixed Format Tests
Authors
Abstract
This study examined various combinations of item response theory (IRT) models and calibration procedures for mixed-format tests. Using real data sets consisting of both dichotomous and polytomous items, nine applicable IRT model mixtures and two calibration procedures were compared on the basis of traditional and alternative goodness-of-fit statistics. Three dichotomous models and three polytomous models were combined to analyze the mixed-format tests using both simultaneous and separate calibration methods. To assess goodness of fit, PARSCALE's G² statistic was used. In addition, two fit statistics proposed by Orlando and Thissen (2000) were extended to more general forms to enable the evaluation of fit for mixed-format tests. The results indicated that, among the IRT model combinations examined, the three-parameter logistic model combined with the generalized partial credit model fit the given data sets best, while the one-parameter logistic model produced the largest number of misfitting items. In a comparison of the three fit statistics, some inconsistencies were found between the traditional and new indices for assessing the fit of IRT models to data: the new indices indicated considerably better model fit than the traditional indices.
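The best-fitting combination in the study pairs the three-parameter logistic (3PL) model for dichotomous items with the generalized partial credit model (GPCM) for polytomous items. As a minimal sketch of those two item response functions (the parameter values below are illustrative, not from the study's data):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response for ability theta,
    discrimination a, difficulty b, and lower asymptote (guessing) c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_gpcm(theta, a, step_difficulties):
    """GPCM category probabilities for a polytomous item.

    `step_difficulties` holds the step parameters b_1..b_m; returns
    probabilities for categories 0..m, which sum to 1."""
    # Cumulative logits; the term for category 0 is fixed at 0.
    logits = [0.0]
    s = 0.0
    for b_j in step_difficulties:
        s += a * (theta - b_j)
        logits.append(s)
    denom = sum(math.exp(z) for z in logits)
    return [math.exp(z) / denom for z in logits]

# At theta == b, the 3PL probability is midway between c and 1.
print(p_3pl(0.0, 1.2, 0.0, 0.2))          # 0.6
print(sum(p_gpcm(0.5, 1.0, [-0.5, 0.0, 0.5])))  # 1.0
```

In a mixed-format calibration, each dichotomous item contributes a 3PL term and each polytomous item a GPCM term to the same likelihood, with all items sharing the common ability scale theta.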
Similar Resources
Practical Consequences of Item Response Theory Model Misfit in the Context of Test Equating with Mixed-Format Test Data
In item response theory (IRT) models, assessing model-data fit is an essential step in IRT calibration. While no general agreement has ever been reached on the best methods or approaches for detecting misfit, perhaps the more important observation from the research literature is that studies rarely evaluate IRT misfit by focusing on its practical consequences. The stu...
Analyzing psychopathology data: a case for nonparametric IRT
Recently, several authors have introduced and discussed the advantage of using item response theory (IRT) models to construct personality scales and to explore the structure of personality data sets. For example, Waller, Tellegen, McDonald, and Lykken (1996) contrasted the use of IRT with principal factor analysis, and Reise and Waller (in press) discussed the choice of an IRT model to analyz...
Identifying DIF for Latent Classes with the Dirichlet Process
of the Dissertation Identifying DIF for Latent Classes with the Dirichlet Process by Miles Satori Chen Doctor of Philosophy in Statistics University of California, Los Angeles, 2015 Professor Peter Bentler, Chair In Item Response Theory (IRT), Differential Item Functioning (DIF) occurs when individuals who have the same ability, but belong to different groups, have different probabilities of an...
plink: An R Package for Linking Mixed-Format Tests Using IRT-Based Methods
This introduction to the R package plink is a (slightly) modified version of Weeks (2010), published in the Journal of Statistical Software. The R package plink has been developed to facilitate the linking of mixed-format tests for multiple groups under a common item design using unidimensional and multidimensional IRT-based methods. This paper presents the capabilities of the package in the co...
IRT-FIT: SAS® Macros for Fitting Item Response Theory (IRT) Models
Psychometrics has recently seen the development of complex measurement models to better represent test and item data. Item Response Theory (IRT), in particular, comprises a set of non-linear latent variable models that appear to have several conceptual and empirical properties that make them more valuable in practice than classical test theory methods. However, IRT-based models typically requir...